Stratified Sampling Design Based on Data Mining
نویسندگان
چکیده
منابع مشابه
Stratified Sampling Design Based on Data Mining
OBJECTIVES To explore classification rules based on data mining methodologies which are to be used in defining strata in stratified sampling of healthcare providers with improved sampling efficiency. METHODS We performed k-means clustering to group providers with similar characteristics, then, constructed decision trees on cluster labels to generate stratification rules. We assessed the varia...
متن کاملStratified and Un-stratified Sampling in Data Mining: Bagging
Stratified sampling is often used in opinion polls to reduce standard errors, and it is known as variance reduction technique in sampling theory. The most common approach of resampling method is based on bootstrapping the dataset with replacement. A main purpose of this work is to investigate extensions of the resampling methods in classification problems, specifically we use decision trees, fr...
متن کاملEvolutionary Programming Based Stratified Design Space Sampling
Recently there have been advances in strati ed sampling techniques that attempt to enforce equal distributions not only across the design variables, but also onto the design space itself. This requires a numerically intensive optimization routine. Until now, no optimization strategy was able to distribute sample points evenly in the design space, but Evolutionary Algorithms (EA) act as an enabl...
متن کاملStratified Sampling for Association Rules Mining
It is well recognized that mining association rules in a very large database is usually time consuming due to the I/O overhead in scanning the disk resident database. As one of the techniques for reducing the I/O overhead, sampling for mining association rules has been actively investigated during the last few years. Each sampling method and algorithm proposed in the literature has its own meri...
متن کاملSupport Vector Machine based on Stratified Sampling
Support vector machine is a classification algorithm based on statistical learning theory. It has shown many results with good performances in the data mining fields. But there are some problems in the algorithm. One of the problems is its heavy computing cost. So we have been difficult to use the support vector machine in the dynamic and online systems. To overcome this problem we propose to u...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Healthcare Informatics Research
سال: 2013
ISSN: 2093-3681,2093-369X
DOI: 10.4258/hir.2013.19.3.186